Add async fallback support for missing LLM attributes in LlamaIndex #282

Merged
shuningc merged 5 commits into main from HYBIM-620-async-fallbacks
Apr 23, 2026
Conversation

@shuningc
Contributor

@shuningc shuningc commented Apr 20, 2026

In async agent flows (ReActAgent using astream_chat), LlamaIndex stores tool definitions on the Agent object rather than in the LLM callback payload. The ContextVar tracking the current agent may not propagate across asyncio.Task boundaries (each task runs in a copy of the context, so a value set inside a task is invisible outside it), so multiple fallback strategies are needed.
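The ContextVar issue can be reproduced in isolation: asyncio runs each task in a copy of the current context, so a value set inside a child task never reaches the parent. A minimal standalone demo (not the PR's actual code):

```python
import asyncio
import contextvars

# Tracks the "current agent"; analogous in spirit to the instrumentation's
# ContextVar, but the name here is illustrative only.
current_agent: contextvars.ContextVar = contextvars.ContextVar(
    "current_agent", default=None
)

async def register_in_task():
    # This set() only mutates the child task's copied context.
    current_agent.set("ReActAgent")

async def main():
    await asyncio.create_task(register_in_task())
    # Back in the parent context, the child's set() is invisible.
    return current_agent.get()

result = asyncio.run(main())
# result is None, not "ReActAgent" -- hence the need for fallbacks.
```

This is exactly why a ContextVar-only lookup can come back empty in astream_chat flows, and why the PR adds additional lookup strategies.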

Changes:

  • Extract tool_definitions from agent context with fallback chain: serialized payload -> parent agent ContextVar -> find_agent_with_tools()
  • Extract response_model from raw response with fallback to request model
  • Extract finish_reasons from raw response choices
  • Add max_tokens fallback chain (serialized -> metadata -> Settings.llm)
  • Detect provider from LLM class name
  • Register agent + tools in wrap_agent_run() before wrapped() is called

Parent PR #274
Sample output:

[
  {
    "name": "chat gpt-5-nano",
    "trace_id": "45490d2d5facbaf466c13cb10f004a24",
    "span_id": "bd0fc54070d0210a",
    "parent_span_id": "e6a83bcb669fe6bc",
    "attributes": {
      "gen_ai.framework": "llamaindex",
      "gen_ai.tool.definitions": "[{\"type\": \"function\", \"name\": \"get_weather\", \"description\": \"get_weather(city: str) -> str\\nGet the current weather for a city.\"}, {\"type\": \"function\", \"name\": \"get_time\", \"description\": \"get_time(timezone: str) -> str\\nGet the current time in a timezone.\"}, {\"type\": \"function\", \"name\": \"calculate\", \"description\": \"calculate(expression: str) -> str\\nCalculate a math expression.\"}]",
      "gen_ai.evaluation.sampled": true,
      "gen_ai.evaluation.error": "None",
      "gen_ai.input.messages": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"What is 2 + 2?\"}]}]",
      "gen_ai.agent.name": "ReActAgent",
      "gen_ai.agent.id": "e6a83bcb669fe6bc",
      "gen_ai.request.model": "gpt-5-nano",
      "gen_ai.operation.name": "chat",
      "gen_ai.response.model": "gpt-5-nano-2025-08-07",
      "gen_ai.usage.input_tokens": 640,
      "gen_ai.usage.output_tokens": 1057,
      "gen_ai.request.temperature": 0.0,
      "gen_ai.request.max_tokens": 4096,
      "gen_ai.response.finish_reasons": ["stop"],
      "gen_ai.workflow.name": "ReActAgent",
      "gen_ai.output.messages": "[{\"role\": \"assistant\", \"parts\": [{\"type\": \"text\", \"content\": \"Thought: I can answer without using any more tools. I'll use the user's language to answer\\nAnswer: 4\"}], \"finish_reason\": \"stop\"}]"
    }
  },
  {
    "name": "chat gpt-5-nano",
    "trace_id": "45490d2d5facbaf466c13cb10f004a24",
    "span_id": "8892d7e048cacc37",
    "parent_span_id": "e6a83bcb669fe6bc",
    "attributes": {
      "gen_ai.framework": "llamaindex",
      "gen_ai.tool.definitions": "[{\"type\": \"function\", \"name\": \"get_weather\", \"description\": \"get_weather(city: str) -> str\\nGet the current weather for a city.\"}, {\"type\": \"function\", \"name\": \"get_time\", \"description\": \"get_time(timezone: str) -> str\\nGet the current time in a timezone.\"}, {\"type\": \"function\", \"name\": \"calculate\", \"description\": \"calculate(expression: str) -> str\\nCalculate a math expression.\"}]",
      "gen_ai.evaluation.sampled": true,
      "gen_ai.evaluation.error": "None",
      "gen_ai.input.messages": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"What is 2 + 2?\"}]}]",
      "gen_ai.agent.name": "ReActAgent",
      "gen_ai.agent.id": "e6a83bcb669fe6bc",
      "gen_ai.request.model": "gpt-5-nano",
      "gen_ai.operation.name": "chat",
      "gen_ai.response.model": "gpt-5-nano-2025-08-07",
      "gen_ai.usage.input_tokens": 640,
      "gen_ai.usage.output_tokens": 1057,
      "gen_ai.request.temperature": 0.0,
      "gen_ai.request.max_tokens": 4096,
      "gen_ai.response.finish_reasons": ["stop"],
      "gen_ai.workflow.name": "ReActAgent",
      "gen_ai.output.messages": "[{\"role\": \"assistant\", \"parts\": [{\"type\": \"text\", \"content\": \"Thought: I can answer without using any more tools. I'll use the user's language to answer\\nAnswer: 4\"}], \"finish_reason\": \"stop\"}]"
    }
  },
  {
    "name": "invoke_agent agent.Agent",
    "trace_id": "45490d2d5facbaf466c13cb10f004a24",
    "span_id": "df17054265101d91",
    "parent_span_id": "e6a83bcb669fe6bc",
    "attributes": {
      "gen_ai.agent.id": "df17054265101d91",
      "gen_ai.framework": "llamaindex",
      "gen_ai.input.messages": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"What is 2 + 2?\"}]}]",
      "gen_ai.evaluation.sampled": true,
      "gen_ai.evaluation.error": "None",
      "gen_ai.output.messages": "[{\"role\": \"assistant\", \"parts\": [{\"type\": \"text\", \"content\": \"Thought: I can answer without using any more tools. I'll use the user's language to answer\\nAnswer: 4\"}]}]",
      "gen_ai.agent.name": "Agent",
      "gen_ai.operation.name": "invoke_agent",
      "gen_ai.workflow.name": "ReActAgent"
    }
  },
  {
    "name": "invoke_agent agent.ReActAgent",
    "trace_id": "45490d2d5facbaf466c13cb10f004a24",
    "span_id": "e6a83bcb669fe6bc",
    "parent_span_id": "97d102738181e22b",
    "attributes": {
      "gen_ai.agent.id": "e6a83bcb669fe6bc",
      "gen_ai.framework": "llamaindex",
      "gen_ai.input.messages": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"What is 2 + 2?\"}]}]",
      "gen_ai.evaluation.sampled": true,
      "gen_ai.evaluation.error": "None",
      "gen_ai.output.messages": "[{\"role\": \"assistant\", \"parts\": [{\"type\": \"text\", \"content\": \"4\"}]}]",
      "gen_ai.agent.name": "ReActAgent",
      "gen_ai.operation.name": "invoke_agent",
      "gen_ai.workflow.name": "ReActAgent"
    }
  },
  {
    "name": "workflow ReActAgent",
    "trace_id": "45490d2d5facbaf466c13cb10f004a24",
    "span_id": "97d102738181e22b",
    "parent_span_id": null,
    "attributes": {
      "gen_ai.operation.name": "invoke_workflow",
      "gen_ai.workflow.name": "ReActAgent",
      "gen_ai.workflow.type": "llamaindex.workflow",
      "gen_ai.framework": "llamaindex",
      "gen_ai.input.messages": "[{\"role\": \"user\", \"parts\": [{\"type\": \"text\", \"content\": \"What is 2 + 2?\"}]}]",
      "gen_ai.evaluation.sampled": true,
      "gen_ai.evaluation.error": "None",
      "gen_ai.output.messages": "[{\"role\": \"assistant\", \"parts\": [{\"type\": \"text\", \"content\": \"4\"}]}]",
      "gen_ai.conversation_root": true
    }
  }
]

Sample circuit app with async calls:
https://shw-playground.signalfx.com/#/apm/traces/f5ed88620ec54e8f153d131f8898b259?matchers=%5B%7B%22key%22:%22Service%22,%22values%22:%5B%22gpt-5-nano%22%5D,%22isNot%22:false%7D%5D

…gent flows

In async agent flows (ReActAgent using astream_chat), LlamaIndex stores tool
definitions on the Agent object rather than in the LLM callback payload. The
ContextVar tracking the current agent may not propagate across asyncio.Task
boundaries, so multiple fallback strategies are needed.

Changes:
- Extract tool_definitions from agent context with fallback chain:
  serialized payload -> parent agent ContextVar -> find_agent_with_tools()
- Extract response_model from raw response with fallback to request model
- Extract finish_reasons from raw response choices
- Add max_tokens fallback chain (serialized -> metadata -> Settings.llm)
- Detect provider from LLM class name
- Register agent + tools in wrap_agent_run() before wrapped() is called
- Add find_agent_with_tools() to search shared dict when ContextVar fails

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@shuningc shuningc requested review from a team as code owners April 20, 2026 07:36
@shuningc shuningc merged commit b790d04 into main Apr 23, 2026
14 checks passed
@shuningc shuningc deleted the HYBIM-620-async-fallbacks branch April 23, 2026 00:59
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 23, 2026
